Dimensionality Reduction Through Unsupervised Features Selection
Authors
Abstract
As storage technologies evolve, the amount of available data explodes in both dimensions: the number of samples and the dimension of the input space. Dimension reduction techniques are therefore needed to explore and analyse these huge data sets. Many feature selection approaches have been proposed for the supervised learning context, but only a few techniques address this issue in the unsupervised learning context. The problem of unsupervised feature selection is more difficult because the sample labels are not available. As a result, most of the proposed methods rely on feature correlations, and only pairs of variables are considered. In this paper, we extend the wkmeans algorithm proposed by Huang to the self-organizing map (SOM) framework and propose a feature selection approach that relies on the weighting coefficients learned during the optimization process. This SOM-based approach addresses the difficult issue of unsupervised feature selection and is able to handle high-dimensional data sets.
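The abstract does not state the update rules, so the following is only a rough sketch of the feature-weighting idea behind wkmeans-style clustering, written for plain k-means rather than the SOM extension the paper proposes. The function name `weighted_kmeans`, its parameters, the caller-supplied initial centers, and the small dispersion floor are all assumptions made for illustration; the exponent `beta` is assumed to be greater than 1.

```python
def weighted_kmeans(points, centers, beta=2.0, iters=10):
    """Illustrative k-means loop with learned feature weights (beta > 1 assumed).

    points  : list of equal-length numeric tuples
    centers : initial centers, one per cluster (supplied by the caller)
    Returns (centers, weights, labels).
    """
    k = len(centers)
    n_features = len(points[0])
    centers = [list(c) for c in centers]
    weights = [1.0 / n_features] * n_features  # start with uniform weights
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center under the weighted squared distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum(
                    (w ** beta) * (x - z) ** 2
                    for w, x, z in zip(weights, p, centers[c])
                ),
            )
        # Center step: per-cluster mean; an empty cluster keeps its old center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        # Weight step: a feature with low within-cluster dispersion gets a
        # high weight. The 1e-12 floor is an assumption to avoid dividing by zero.
        disp = [1e-12] * n_features
        for p, lab in zip(points, labels):
            for j in range(n_features):
                disp[j] += (p[j] - centers[lab][j]) ** 2
        exponent = 1.0 / (beta - 1.0)
        weights = [
            1.0 / sum((disp[j] / disp[t]) ** exponent for t in range(n_features))
            for j in range(n_features)
        ]
    return centers, weights, labels
```

Under these assumptions the learned weights sum to one, so features can be ranked by weight and the lowest-ranked ones discarded. How the paper itself thresholds or ranks the weights is not described in the abstract.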
Similar resources
Combining Unsupervised Variable Selection with Dimensionality Reduction
This paper bridges the gap between variable selection methods (e.g., Pearson coefficients, KS test) and dimensionality reduction algorithms (e.g., PCA, LDA). Variable selection algorithms encounter difficulties dealing with highly correlated data, as many features are similar in quality. Dimensionality reduction algorithms tend to combine all variables, and are not able to select significant variab...
Steel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps
Steel consumption is a critical factor affecting pricing decisions and a key element in achieving sustainable industrial development. The main purpose of this paper is to forecast future trends in steel consumption by analysing nonlinear patterns with artificial intelligence (AI) techniques. Because there are several features affecting the target variable which make the analysis of relations...
Dimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)
This paper shows how genetic programming can be used to select suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, the selection of suitable features may significantly affect the performance of the process. Simulations were conducted at SNRs of 5 dB and 10 dB. Test and ...
Graph Autoencoder-Based Unsupervised Feature Selection with Broad and Local Data Structure Preservation
Feature selection is a dimensionality reduction technique that selects a subset of representative features from high-dimensional data by eliminating irrelevant and redundant features. Recently, feature selection combined with sparse learning has attracted significant attention due to its outstanding performance compared with traditional feature selection methods that ignore correlation between ...
A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters
Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...